Feature screening algorithm for high dimensional data

Abstract

Feature screening is becoming an important topic in machine learning and high-dimensional data analysis. Filtering out irrelevant features from a set of variables is considered a preliminary step that should be performed before any data analysis. Many approaches to this problem have been proposed since the work of Fan and Lv (J. Royal Stat. Soc., Ser. B. 70 (5), 849–911 (2008)), who introduced the sure screening property. However, the reported performance of these methods differs from one paper to another. In this work, we add to the list a new feature screening algorithm for a continuous response variable, inspired by the Kendall interaction filter (J. Appl. Stat. 50 (7), 1496–1514 (2020)). The good behavior of our algorithm is demonstrated through a comparison with an existing method under several simulation scenarios.
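The abstract does not spell out the proposed algorithm, so the following is only a generic illustration of rank-based marginal screening for a continuous response, not the authors' method. It scores each feature by the absolute value of Kendall's tau with the response and keeps the top d features. The function names kendall_tau and kendall_screen, the simulated data, and the screening size d = n / log(n) (a common choice following Fan and Lv) are all assumptions made for this sketch.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a between two 1-D arrays (O(n^2); tied pairs contribute 0)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    # Signs of all pairwise differences; concordant pairs give +1, discordant -1.
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    # Every unordered pair is counted twice in the full matrix, hence n * (n - 1).
    return float((dx * dy).sum() / (n * (n - 1)))

def kendall_screen(X, y, d):
    """Rank features by |tau(X_j, y)| and return the indices of the top d, plus all scores."""
    X = np.asarray(X, dtype=float)
    scores = np.array([abs(kendall_tau(X[:, j], y)) for j in range(X.shape[1])])
    keep = np.argsort(-scores)[:d]
    return keep, scores

# Hypothetical simulation: 500 features, only features 0 and 3 drive the response.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.5 * rng.standard_normal(n)

d = int(n / np.log(n))  # assumed screening size, not taken from the paper
keep, scores = kendall_screen(X, y, d)
print("retained features:", sorted(keep.tolist()))
```

Because Kendall's tau depends only on ranks, a screen of this kind is insensitive to monotone transformations of the response and to heavy-tailed noise, which is one common motivation for rank-based filters over Pearson-correlation screening.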

Similar resources

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...

Constrained Based Feature Subset Selection Algorithm for High Dimensional Data

Feature selection picks the useful features from the original dataset in order to obtain more accurate results. The Constrained Based Feature Subset Selection (CFSS) algorithm removes irrelevant and redundant features. The method computes a similarity measure based on entropy and conditional entropy values. After computing similarity computation to applied Approximate Relevancy (AR)...

Feature Selection for High-dimensional Integrated Data

Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y, and the remainder of the predictors constitute a “noise set” Xu independent of Y. Using Monte Carlo simulations, we investigated the relative perfor...

Feature selection for high-dimensional industrial data

In the semiconductor industry the number of circuits per chip is still drastically increasing. This fact and strong competition lead to the particular importance of quality control and quality assurance. As a result a vast amount of data is recorded during the fabrication process, which is very complex in structure and massively affected by noise. The evaluation of this data is a vital task to ...

Variable Screening in High-dimensional Feature Space

Variable selection in high-dimensional space characterizes many contemporary problems in scientific discovery and decision making. Fan and Lv [8] introduced the concept of sure screening to reduce the dimensionality. This article first reviews part of their ideas and results and then extends them to likelihood-based models. The techniques are then applied to disease classifications in c...

Journal

Journal title: Mathematical Modeling and Computing

Year: 2023

ISSN: 2312-9794, 2415-3788

DOI: https://doi.org/10.23939/mmc2023.03.703